surements that overthrows the theoretical framework within which a measurement was made does actually lead to a change in $K$. Equation (6.13) formalizes the notion of quiddity qua essence, comprising substance ($K$) and properties ($I$). The calculation of $K$ will be dealt with in more detail in Chap. 11. As a final remark in this section,
we note that the results of an experiment or observation transmitted elsewhere may
have the same effect on the recipient as if he had carried out the experiment himself.
Problem. Critically scrutinize Fig. 6.1 in the light of the above discussion and attempt
to quantify the information flows.
It often happens that experiments are planned (designed), although when the
level of ignorance is high it is often more fruitful to first “play around”. For example,
Alexander Fleming left some Petri dishes open in his laboratory and observed what
grew on them. Later, a specific experiment was designed to evince the antibacterial
action of Penicillium notatum. The design, incorporating prior information and ways
to eliminate possible confounders, and so forth, embodies structural information $K$.
6.2
Constraint
Shannon puts emphasis on the information resulting from selection from a set of pos-
sible alternatives (implying the existence of alternatives)—information can only be
received where there is doubt. Much of the theory of information deals with signals,
which operate on the set of alternatives constituting the recipient’s doubt to yield a
lesser doubt, or even certainty (zero doubt). Thus, the signals themselves have an
information content by virtue of their potential for making selections; the quantity
of information corresponds to the intensity of selection or to the recipient’s surprise
upon receiving the information. $I$ from Eq. (6.5) gives the average information con-
tent per symbol; it is a weighted mean of the degree of uncertainty (i.e., freedom of
choice) in choosing a symbol before any choice is made.
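If Eq. (6.5) has the familiar Shannon form $I = -\sum_i p_i \log_2 p_i$ (an assumption here, since the equation itself appears earlier in the chapter), the weighted mean it describes can be sketched in a few lines; the skewed distribution below is invented for illustration, not taken from any corpus.

```python
import math

def average_information(probabilities):
    """Average information content per symbol in bits, assuming
    Eq. (6.5) is the Shannon form I = -sum(p_i * log2(p_i))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# 26 equiprobable letters: maximal freedom of choice,
# I = log2(26), roughly 4.70 bits per symbol
uniform = [1 / 26] * 26
print(average_information(uniform))

# A constrained (hypothetical) distribution: the average falls,
# because the recipient's surprise at frequent symbols is small
skewed = [0.5, 0.25] + [0.25 / 24] * 24
print(average_information(skewed))
```

For 26 equiprobable letters the sketch returns $\log_2 26 \approx 4.70$ bits per symbol; any constraint that skews the choice lowers the average, matching the idea that less doubt in the selection means less information per symbol received.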
If we are writing a piece of prose, and even more so if it is verse, our freedom of
choice of letters is considerably constrained; for example, the probability that “x”
follows “g” in an English text is much lower than $1/26$ (or $1/27$ if we include, as we should,
the space as a symbol). In other words, the selection of a particular letter depends on
the preceding symbol, or group of preceding symbols. This problem in linguistics
was first investigated by Markov, who encoded a poem of Pushkin’s using a binary
coding scheme admitting consonants (C) or vowels (V). Markov proposed that the selection of successive symbols C or V did not depend simply on their unconditional probabilities as determined by their frequencies ($v = V/(V + C)$, where $V$ and $C$ are, respectively, the total numbers of vowels and consonants). To every pair of letters $(L_j, L_k)$ there corresponds a conditional probability $p_{jk}$; given that $L_j$ has occurred, the probability of $L_k$ at the next selection is $p_{jk}$. If the initial letter has a probability $a_j$, then the probability of the sequence $(L_j, L_k, L_l)$ is $a_j p_{jk} p_{kl}$, and so forth. The scheme can be conveniently written in matrix notation:
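As a computational aside, the chain rule just stated ($a_j p_{jk} p_{kl} \cdots$) can be sketched directly. The two-symbol alphabet follows Markov's C/V coding, but the initial probabilities and the transition table below are invented for illustration, not estimated from the Pushkin text.

```python
# Symbols: index 0 = C (consonant), index 1 = V (vowel).
# Hypothetical initial probabilities a_j and conditional
# probabilities p_jk (each row sums to 1).
a = [0.57, 0.43]            # P(first symbol is C), P(first symbol is V)
p = [[0.34, 0.66],          # p_CC, p_CV
     [0.87, 0.13]]          # p_VC, p_VV

def sequence_probability(seq, a, p):
    """Probability of a symbol sequence under the first-order
    Markov scheme: a_j * p_jk * p_kl * ...  (seq holds indices)."""
    prob = a[seq[0]]
    for j, k in zip(seq, seq[1:]):
        prob *= p[j][k]
    return prob

C, V = 0, 1
# P(C, V, C) = a_C * p_CV * p_VC = 0.57 * 0.66 * 0.87
print(sequence_probability([C, V, C], a, p))
```

Collecting the $p_{jk}$ into a matrix, as the notation above suggests, has the further advantage that multi-step probabilities are obtained by matrix multiplication.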